The shapes of many objects in the built environment are determined by their relationships to the human body: how will a person interact with this object? Existing data-driven generative models of 3D shapes produce plausible objects, but do not reason about the relationship of those objects to the human body. In this paper, we learn body-aware generative models of 3D shapes. Specifically, we train generative models of chairs, a ubiquitous shape category, which can be conditioned on a given body shape or sitting pose. The body-shape-conditioned models produce chairs which are comfortable for a person with the given body shape; the pose-conditioned models produce chairs which accommodate the given sitting pose. To train these models, we define a "sitting pose matching" metric and a novel "sitting comfort" metric. Computing these metrics requires an expensive optimization to sit the body into the chair, which is too slow to be used as a loss function for training a generative model. We therefore train neural networks to efficiently approximate these metrics. We use our approach to train three body-aware generative shape models: a structured part-based generator, a point cloud generator, and an implicit surface generator. In all cases, our method produces models which adapt their output chair shapes to the input human body specification.
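Because the learned metric approximators are neural networks, they can act as differentiable loss terms for the generators. The sketch below is a minimal, hypothetical PyTorch illustration of that training pattern, not the authors' implementation: ComfortNet, ChairGenerator, and all dimensions are assumptions.

import torch
import torch.nn as nn

class ComfortNet(nn.Module):
    """Approximates comfort(body, chair) learned from slow optimization-based labels."""
    def __init__(self, body_dim=10, chair_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(body_dim + chair_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )
    def forward(self, body, chair):
        return self.mlp(torch.cat([body, chair], dim=-1))

class ChairGenerator(nn.Module):
    """Maps noise + body parameters to a chair latent (stand-in for a part/point-cloud/implicit decoder)."""
    def __init__(self, noise_dim=64, body_dim=10, chair_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(noise_dim + body_dim, 256), nn.ReLU(),
            nn.Linear(256, chair_dim),
        )
    def forward(self, z, body):
        return self.mlp(torch.cat([z, body], dim=-1))

comfort_net = ComfortNet()           # assumed pre-trained on (body, chair, comfort) triples
for p in comfort_net.parameters():   # frozen: it only provides gradients to the generator
    p.requires_grad_(False)

gen = ChairGenerator()
opt = torch.optim.Adam(gen.parameters(), lr=1e-4)

body = torch.randn(32, 10)           # body-shape parameters (e.g. shape coefficients), assumed input
z = torch.randn(32, 64)
chair = gen(z, body)
loss = -comfort_net(body, chair).mean()   # maximize predicted comfort; a realism term would be added in practice
opt.zero_grad(); loss.backward(); opt.step()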
Methods for supervised principal component analysis (SPCA) aim to incorporate label information into principal component analysis (PCA) so that the extracted features are more useful for a prediction task of interest. Prior work on SPCA has focused primarily on optimizing prediction error and has neglected the value of maximizing the variance explained by the extracted features. We propose a new method for SPCA that jointly addresses both of these objectives, and we empirically demonstrate that our approach dominates existing methods, i.e., outperforms them with respect to both prediction error and variance explained. Our method accommodates arbitrary supervised learning losses and, through a statistical reformulation, provides a novel low-rank extension of generalized linear models.
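One way to make "jointly addresses both objectives" concrete is a single criterion that trades off supervised loss against explained variance over an orthonormal projection; the formulation below is an illustrative sketch under that reading, not the authors' exact objective.
\[
\min_{W,\,\theta}\;\lambda\sum_{i=1}^{n}\ell\big(y_i,\,g_\theta(W^\top x_i)\big)\;-\;(1-\lambda)\,\operatorname{tr}\big(W^\top\hat{\Sigma}\,W\big)\quad\text{s.t. }W^\top W=I_k,
\]
where $W\in\mathbb{R}^{d\times k}$ collects the $k$ components, $\hat{\Sigma}$ is the empirical input covariance, $\ell$ is an arbitrary supervised loss (a GLM negative log-likelihood recovers the low-rank GLM view), and $\lambda\in[0,1]$ balances prediction error against variance explained.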
In the past years, deep learning has seen increasing use in the domain of histopathological applications. However, while these approaches have shown great potential, in high-risk environments deep learning models need to be able to judge their own uncertainty and reject inputs when there is a significant chance of misclassification. In this work, we conduct a rigorous evaluation of the most commonly used uncertainty and robustness methods for the classification of Whole-Slide-Images under domain shift using the H\&E stained Camelyon17 breast cancer dataset. Although it is known that histopathological data can be subject to strong domain shift and label noise, to our knowledge this is the first work that compares the most common methods for uncertainty estimation under these conditions. In our experiments, we compare Stochastic Variational Inference, Monte-Carlo Dropout, Deep Ensembles, Test-Time Data Augmentation as well as combinations thereof. We observe that ensembles of methods generally lead to higher accuracies and better calibration, and that Test-Time Data Augmentation can be a promising alternative when an appropriate set of augmentations is chosen. Across methods, rejecting the most uncertain tiles leads to a significant increase in classification accuracy on both in-distribution and out-of-distribution data. Furthermore, we conduct experiments comparing these methods under varying levels of label noise. We observe that the border regions of the Camelyon17 dataset are subject to label noise and evaluate the robustness of the included methods against different noise levels. Lastly, we publish our code framework to facilitate further research on uncertainty estimation for histopathological data.
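The following is a minimal sketch of one of the compared recipes (Monte-Carlo Dropout) together with the uncertainty-based rejection step; the model, the rejection fraction, and the tile handling are placeholders, not the paper's exact setup.

import torch
import torch.nn.functional as F

def mc_dropout_predict(model, x, n_samples=20):
    """Keep dropout active at test time and average the softmax outputs."""
    model.train()  # enables dropout layers; BatchNorm would be frozen in a real setup
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(n_samples)])
    mean_probs = probs.mean(dim=0)                                    # predictive distribution
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    return mean_probs, entropy

def reject_most_uncertain(mean_probs, entropy, reject_fraction=0.2):
    """Drop the tiles with the highest predictive entropy and classify the rest."""
    k = int(len(entropy) * (1.0 - reject_fraction))
    keep = entropy.argsort()[:k]                                      # lowest-entropy (most confident) tiles
    return keep, mean_probs[keep].argmax(dim=-1)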
Charisma is considered as one's ability to attract and potentially also influence others. Clearly, there can be considerable interest from an artificial intelligence (AI) perspective to provide it with such skill. Beyond that, a plethora of use cases opens up for computational measurement of human charisma, such as tutoring humans in the acquisition of charisma, mediating human-to-human conversation, or identifying charismatic individuals in big social data. A number of models exist that base charisma on various dimensions, often following the idea that charisma is given if someone could and would help others. Examples include influence (could help) and affability (would help) in scientific studies, or power (could help), presence, and warmth (both would help) as a popular concept. Modelling high levels of these dimensions in humanoid robots or virtual agents seems accomplishable. Beyond that, automatic measurement also appears quite feasible given the recent advances in the related fields of Affective Computing and Social Signal Processing. Here, we therefore present a blueprint for building machines that can appear charismatic, but also analyse the charisma of others. To this end, we first provide the psychological perspective, including different models of charisma and behavioural cues of it. We then switch to conversational charisma in spoken language as an exemplary modality that is essential for human-human and human-computer conversations. The computational perspective then deals with the recognition and generation of charismatic behaviour by AI. This includes an overview of the state of play in the field and the aforementioned blueprint. We then name exemplary use cases of computational charismatic skills before switching to ethical aspects and concluding this overview and perspective on building charisma-enabled AI.
Deep learning-based 3D human pose estimation performs best when trained on large amounts of labeled data, making combined learning from many datasets an important research direction. One obstacle to this endeavor is that different datasets provide different skeleton formats, i.e., they do not label the same set of anatomical landmarks. There is little prior research on how to best supervise one model with such discrepant labels. We show that simply using separate output heads for different skeletons results in inconsistent depth estimates and insufficient information sharing across skeletons. As a remedy, we propose a novel affine-combining autoencoder (ACAE) method to perform dimensionality reduction on the number of landmarks. The discovered latent 3D points capture the redundancy among skeletons, enabling enhanced information sharing when used for consistency regularization. Our approach scales to an extreme multi-dataset regime, where we use 28 3D human pose datasets to supervise one model, which outperforms prior work on a range of benchmarks, including the challenging 3D Poses in the Wild (3DPW) dataset. Our code and models are available for research purposes.
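The core mechanism of an affine-combining autoencoder is that each latent 3D point is an affine combination (weights summing to 1) of the input landmarks, and the decoder maps the latent points back to every skeleton format the same way. The sketch below is an illustrative assumption-laden reimplementation of that idea, not the released code; landmark counts and the weight parametrization are placeholders.

import torch
import torch.nn as nn

class AffineCombination(nn.Module):
    """Maps (batch, n_in, 3) keypoints to (batch, n_out, 3) via weights that sum to 1 per output point."""
    def __init__(self, n_in, n_out):
        super().__init__()
        self.raw = nn.Parameter(torch.randn(n_out, n_in) * 0.01)
    def forward(self, points):
        w = self.raw - self.raw.mean(dim=1, keepdim=True) + 1.0 / self.raw.shape[1]  # rows sum to 1
        return torch.einsum('oi,bij->boj', w, points)

n_landmarks, n_latent = 122, 32          # e.g. the union of all dataset skeletons -> a small latent set
encoder = AffineCombination(n_landmarks, n_latent)
decoder = AffineCombination(n_latent, n_landmarks)

poses = torch.randn(8, n_landmarks, 3)   # stand-in for predicted 3D joints across skeleton formats
recon = decoder(encoder(poses))
loss = (recon - poses).pow(2).mean()     # reconstruction error; used for consistency regularization in training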
This article concerns Bayesian inference using deep linear networks with output dimension one. In the interpolating (zero noise) regime we show that with Gaussian weight priors and MSE negative log-likelihood loss both the predictive posterior and the Bayesian model evidence can be written in closed form in terms of a class of meromorphic special functions called Meijer-G functions. These results are non-asymptotic and hold for any training dataset, network depth, and hidden layer widths, giving exact solutions to Bayesian interpolation using a deep Gaussian process with a Euclidean covariance at each layer. Through novel asymptotic expansions of Meijer-G functions, a rich new picture of the role of depth emerges. Specifically, we find that the posteriors in deep linear networks with data-independent priors are the same as in shallow networks with evidence maximizing data-dependent priors. In this sense, deep linear networks make provably optimal predictions. We also prove that, starting from data-agnostic priors, Bayesian model evidence in wide networks is only maximized at infinite depth. This gives a principled reason to prefer deeper networks (at least in the linear case). Finally, our results show that with data-agnostic priors a novel notion of effective depth given by \[\#\text{hidden layers}\times\frac{\#\text{training data}}{\text{network width}}\] determines the Bayesian posterior in wide linear networks, giving rigorous new scaling laws for generalization error.
In this paper we study the smooth strongly convex minimization problem $\min_{x}\min_y f(x,y)$. The existing optimal first-order methods require $\mathcal{O}(\sqrt{\max\{\kappa_x,\kappa_y\}} \log 1/\epsilon)$ of computations of both $\nabla_x f(x,y)$ and $\nabla_y f(x,y)$, where $\kappa_x$ and $\kappa_y$ are condition numbers with respect to variable blocks $x$ and $y$. We propose a new algorithm that only requires $\mathcal{O}(\sqrt{\kappa_x} \log 1/\epsilon)$ of computations of $\nabla_x f(x,y)$ and $\mathcal{O}(\sqrt{\kappa_y} \log 1/\epsilon)$ computations of $\nabla_y f(x,y)$. In some applications $\kappa_x \gg \kappa_y$, and computation of $\nabla_y f(x,y)$ is significantly cheaper than computation of $\nabla_x f(x,y)$. In this case, our algorithm substantially outperforms the existing state-of-the-art methods.
This paper presents a solution to the GenChal 2022 shared task dedicated to feedback comment generation for writing learning. In this task, given a text with an error and the span of the error, a system generates an explanatory note that helps the writer (a language learner) improve their writing skills. Our solution is based on fine-tuning the T5 model on the initial dataset, augmented according to the syntactic dependencies of the words located within the indicated error span. The solution of our team "nigula" obtained second place according to manual evaluation by the organizers.
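As a hedged sketch of the general recipe (fine-tuning T5 to generate a feedback comment for a marked error span), the snippet below uses the Hugging Face transformers library; the input format, the bracket marking, and the hyper-parameters are assumptions, not team nigula's exact pipeline.

from transformers import T5TokenizerFast, T5ForConditionalGeneration
import torch

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def build_input(text, start, end):
    """Mark the error span with brackets so the model knows which words to explain."""
    return "generate feedback: " + text[:start] + "[" + text[start:end] + "]" + text[end:]

src = build_input("He suggested me to go home.", 3, 12)   # marks "suggested"
tgt = "The verb <suggest> cannot take the pattern <suggest someone to do>; use <suggest that ...> instead."

enc = tokenizer(src, return_tensors="pt", truncation=True)
labels = tokenizer(tgt, return_tensors="pt", truncation=True).input_ids

loss = model(**enc, labels=labels).loss   # one fine-tuning step back-propagates this loss
loss.backward()
torch.optim.AdamW(model.parameters(), lr=3e-5).step()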
Autoencoders are a popular model in many branches of machine learning and lossy data compression. However, their fundamental limits, the performance of gradient methods and the features learnt during optimization remain poorly understood, even in the two-layer setting. In fact, earlier work has considered either linear autoencoders or specific training regimes (leading to vanishing or diverging compression rates). Our paper addresses this gap by focusing on non-linear two-layer autoencoders trained in the challenging proportional regime in which the input dimension scales linearly with the size of the representation. Our results characterize the minimizers of the population risk, and show that such minimizers are achieved by gradient methods; their structure is also unveiled, thus leading to a concise description of the features obtained via training. For the special case of a sign activation function, our analysis establishes the fundamental limits for the lossy compression of Gaussian sources via (shallow) autoencoders. Finally, while the results are proved for Gaussian data, numerical simulations on standard datasets display the universality of the theoretical predictions.
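To make the sign-activation setting tangible, the small numerical illustration below (not the paper's experiments) compresses i.i.d. Gaussian inputs with a two-layer autoencoder whose encoder is sign(Wx), here with a fixed random W, and whose decoder is the least-squares linear map from codes back to inputs, and reports distortion as a function of the compression rate k/d in the proportional regime.

import numpy as np

rng = np.random.default_rng(0)
d, n = 200, 20000                        # input dimension, number of Gaussian samples
X = rng.standard_normal((n, d))

for k in (50, 100, 200, 400):            # code size; rate = k/d
    W = rng.standard_normal((k, d)) / np.sqrt(d)
    Z = np.sign(X @ W.T)                 # non-linear (sign) encoder
    B, *_ = np.linalg.lstsq(Z, X, rcond=None)   # optimal linear decoder for this encoder
    mse = np.mean((Z @ B - X) ** 2)
    print(f"rate k/d = {k/d:.2f}  per-coordinate distortion = {mse:.3f}")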
We introduce a machine-learning (ML)-based weather simulator--called "GraphCast"--which outperforms the most accurate deterministic operational medium-range weather forecasting system in the world, as well as all previous ML baselines. GraphCast is an autoregressive model, based on graph neural networks and a novel high-resolution multi-scale mesh representation, which we trained on historical weather data from the European Centre for Medium-Range Weather Forecasts (ECMWF)'s ERA5 reanalysis archive. It can make 10-day forecasts, at 6-hour time intervals, of five surface variables and six atmospheric variables, each at 37 vertical pressure levels, on a 0.25-degree latitude-longitude grid, which corresponds to roughly 25 x 25 kilometer resolution at the equator. Our results show GraphCast is more accurate than ECMWF's deterministic operational forecasting system, HRES, on 90.0% of the 2760 variable and lead time combinations we evaluated. GraphCast also outperforms the most accurate previous ML-based weather forecasting model on 99.2% of the 252 targets it reported. GraphCast can generate a 10-day forecast (35 gigabytes of data) in under 60 seconds on Cloud TPU v4 hardware. Unlike traditional forecasting methods, ML-based forecasting scales well with data: by training on bigger, higher quality, and more recent data, the skill of the forecasts can improve. Together these results represent a key step forward in complementing and improving weather modeling with ML, open new opportunities for fast, accurate forecasting, and help realize the promise of ML-based simulation in the physical sciences.
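The autoregressive rollout pattern described above can be sketched as follows; the "model" here is a toy stand-in for the graph-neural-network simulator, and the state shapes are illustrative rather than GraphCast's actual 0.25-degree, 37-level grid.

import numpy as np

def step_6h(model, state_prev, state_curr):
    """One 6-hour step: the learned simulator maps the two most recent states to the next one."""
    return model(state_prev, state_curr)

def rollout_10_days(model, state_prev, state_curr, n_steps=40):   # 40 x 6 h = 10 days
    forecast = []
    for _ in range(n_steps):
        state_next = step_6h(model, state_prev, state_curr)
        forecast.append(state_next)
        state_prev, state_curr = state_curr, state_next            # feed the prediction back in
    return np.stack(forecast)

dummy_model = lambda prev, curr: curr + 0.5 * (curr - prev)        # toy stand-in so the sketch runs end to end
grid = (13, 64, 32)                      # (variables x levels, lat, lon), toy shapes only
fc = rollout_10_days(dummy_model, np.zeros(grid), np.ones(grid))
print(fc.shape)                          # (40, 13, 64, 32)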